
fix: 5 critical bugs — TEE randomization, HTTP 402 hint, int dtype, inference None crash, gas estimation #201

Open

amathxbt wants to merge 4 commits into OpenGradient:main from amathxbt:fix/critical-5-bugs-tee-402-dtype-inference-gas

Conversation

@amathxbt
Contributor

Summary

This PR fixes 5 critical bugs identified from open issues and code audit, spanning TEE selection, LLM error handling, type conversions, inference safety, and gas estimation.


Bug 1 — TEE always picks the same node (closes #200)

File: src/opengradient/client/tee_registry.py

Root cause: get_llm_tee() always returned tees[0], routing 100% of traffic to a single TEE with zero load distribution or failover.

Fix: Replace tees[0] with random.choice(tees) so each LLM() construction independently selects from the full pool of active, registry-verified TEEs.

# Before
return tees[0]

# After
selected = random.choice(tees)
logger.debug("Selected TEE %s from %d active LLM proxy TEE(s)", selected.tee_id, len(tees))
return selected
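The full function body is not part of the diff; as a runnable sketch of the selection logic, assuming a `TEEInfo` record and a `get_llm_tee` signature that are hypothetical stand-ins for the registry's actual types:

```python
import logging
import random
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass(frozen=True)
class TEEInfo:
    """Hypothetical stand-in for the registry's TEE record; field name is an assumption."""
    tee_id: str


def get_llm_tee(tees: list[TEEInfo]) -> TEEInfo:
    """Pick one TEE uniformly at random instead of always returning tees[0]."""
    if not tees:
        raise RuntimeError("No active LLM proxy TEEs available.")
    selected = random.choice(tees)
    logger.debug("Selected TEE %s from %d active LLM proxy TEE(s)", selected.tee_id, len(tees))
    return selected
```

Because each `LLM()` construction calls this independently, repeated client creations spread across the active pool with no coordination needed.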

Bug 2 — HTTP 402 swallowed as cryptic RuntimeError (closes #188)

File: src/opengradient/client/llm.py

Root cause: All HTTP errors (including 402 Payment Required) were caught by except Exception and re-raised as a generic RuntimeError("TEE LLM chat failed: ..."). Users had no idea they needed to call ensure_opg_approval() first.

Fix: Add import httpx and intercept httpx.HTTPStatusError before the generic handler. When status_code == 402, raise a RuntimeError with an explicit, actionable message pointing to ensure_opg_approval(). Applied to _chat_request(), completion(), and _parse_sse_response() (streaming path).

# New constant
_402_HINT = (
    "Payment required (HTTP 402): your wallet may have insufficient OPG token allowance. "
    "Call llm.ensure_opg_approval(opg_amount=<amount>) to approve Permit2 spending "
    "before making requests. Minimum amount is 0.05 OPG."
)

# In _chat_request / completion / _parse_sse_response:
except httpx.HTTPStatusError as e:
    if e.response.status_code == 402:
        raise RuntimeError(_402_HINT) from e
    raise RuntimeError(f"TEE LLM chat failed: {e}") from e
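The branching can be exercised without a live TEE or httpx installed. Below is a minimal sketch: `FakeResponse` and `FakeHTTPStatusError` are stand-ins mimicking the shape of httpx's exception, and `wrap_http_error` is a hypothetical helper isolating the 402 check:

```python
class FakeResponse:
    """Stand-in for httpx.Response; only the attribute the handler reads."""
    def __init__(self, status_code: int):
        self.status_code = status_code


class FakeHTTPStatusError(Exception):
    """Stand-in mimicking httpx.HTTPStatusError's .response attribute."""
    def __init__(self, response: FakeResponse):
        super().__init__(f"HTTP {response.status_code}")
        self.response = response


_402_HINT = (
    "Payment required (HTTP 402): your wallet may have insufficient OPG token allowance. "
    "Call llm.ensure_opg_approval(opg_amount=<amount>) to approve Permit2 spending "
    "before making requests. Minimum amount is 0.05 OPG."
)


def wrap_http_error(e: FakeHTTPStatusError) -> RuntimeError:
    """Translate a 402 into the actionable hint; all other statuses stay generic."""
    if e.response.status_code == 402:
        return RuntimeError(_402_HINT)
    return RuntimeError(f"TEE LLM chat failed: {e}")
```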

Bug 3 — Fixed-point int returns np.float32 instead of int (partially closes #103)

File: src/opengradient/client/_conversions.py

Root cause: convert_to_float32(value, decimals) always returned np.float32, even when decimals == 0 (i.e., the original value was an integer). Users expecting integer outputs received np.float32 and had to cast manually.

Fix: Add convert_fixed_point_to_python(value, decimals) that returns int when decimals == 0 and np.float32 otherwise. np.array() then infers the correct dtype (int64 vs float32) automatically. The old convert_to_float32 is kept as a deprecated alias for backward compatibility.

def convert_fixed_point_to_python(value: int, decimals: int) -> Union[int, np.float32]:
    if decimals == 0:
        return int(value)      # ← integer tensor stays integer
    return np.float32(Decimal(value) / (10 ** Decimal(decimals)))
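A quick check of the resulting dtypes — a sketch in which only `convert_fixed_point_to_python` itself comes from the diff; the sample values are illustrative:

```python
from decimal import Decimal
from typing import Union

import numpy as np


def convert_fixed_point_to_python(value: int, decimals: int) -> Union[int, np.float32]:
    # decimals == 0 means the on-chain value was an integer; keep it as int.
    if decimals == 0:
        return int(value)
    return np.float32(Decimal(value) / (10 ** Decimal(decimals)))


# np.array() infers the dtype from the Python element types:
ints = np.array([convert_fixed_point_to_python(v, 0) for v in (1, 2, 3)])
floats = np.array([convert_fixed_point_to_python(v, 2) for v in (150, 275)])
```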

Bug 4 — infer() crashes with AttributeError when inference node returns None

File: src/opengradient/client/alpha.py

Root cause: _get_inference_result_from_node() returns None when the node has no result yet. This None was passed directly to convert_to_model_output(), which calls event_data.get("output", {}), raising AttributeError: 'NoneType' object has no attribute 'get'. Additionally, if ModelInferenceEvent logs were empty, parsed_logs[0] raised an IndexError.

Fix:

# Guard 1: empty precompile logs
if not precompile_logs:
    raise RuntimeError("ModelInferenceEvent not found in transaction logs.")

# Guard 2: None from inference node
if inference_result is None:
    raise RuntimeError(
        f"Inference node returned no result for inference ID {inference_id!r}. "
        "The result may not be available yet — retry after a short delay."
    )
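The None guard can be isolated as a small helper — `require_inference_result` is a hypothetical name for illustration; the error message mirrors the diff:

```python
from typing import Any, Dict, Optional


def require_inference_result(
    inference_result: Optional[Dict[str, Any]], inference_id: str
) -> Dict[str, Any]:
    """Fail loudly with a retriable error instead of letting None reach .get() downstream."""
    if inference_result is None:
        raise RuntimeError(
            f"Inference node returned no result for inference ID {inference_id!r}. "
            "The result may not be available yet — retry after a short delay."
        )
    return inference_result
```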

Bug 5 — run_workflow() uses hardcoded 30M gas

File: src/opengradient/client/alpha.py

Root cause: run_workflow() hardcoded gas=30000000, unlike infer() which correctly uses estimate_gas(). This is both wasteful (users overpay) and fragile on networks with lower block gas limits.

Fix: Call estimate_gas() first and multiply by 3 for headroom. Fall back to 30M only if estimation itself fails.

try:
    estimated_gas = run_function.estimate_gas({"from": self._wallet_account.address})
    gas_limit = int(estimated_gas * 3)
except Exception:
    gas_limit = 30000000  # fallback
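The estimate-then-fallback logic in isolation — `compute_gas_limit` is a hypothetical helper for this sketch; the real code calls `run_function.estimate_gas()` with the wallet address:

```python
from typing import Callable

FALLBACK_GAS = 30_000_000  # the previous hardcoded limit, now only a safety net


def compute_gas_limit(estimate_gas: Callable[[], int], headroom: int = 3) -> int:
    """Estimate first (consistent with infer()), multiply for headroom, fall back on failure."""
    try:
        return int(estimate_gas() * headroom)
    except Exception:
        return FALLBACK_GAS
```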

Also replaces the hardcoded timeout=60 in new_workflow() with the INFERENCE_TX_TIMEOUT constant.


Files Changed

| File | Bugs Fixed |
| --- | --- |
| src/opengradient/client/tee_registry.py | Bug 1 (TEE randomization) |
| src/opengradient/client/llm.py | Bug 2 (HTTP 402 hint) |
| src/opengradient/client/_conversions.py | Bug 3 (int dtype) |
| src/opengradient/client/alpha.py | Bugs 4 & 5 (None guard + gas) |

Related Issues

Closes #200
Closes #188
Partially closes #103

Previously get_llm_tee() always returned tees[0], the first TEE in the
registry list. This caused all clients to hit the same TEE, providing no
load distribution and no resilience when that TEE starts failing.

Fix: use random.choice(tees) so each LLM() construction independently
picks from all currently active TEEs. Successive retries or re-initializations
will therefore naturally spread across the healthy pool.

Closes OpenGradient#200
Previously any HTTP error from the TEE (including 402 Payment Required)
was silently wrapped into a generic RuntimeError("TEE LLM chat failed: ...").
This caused the confusing traceback seen in issue OpenGradient#188 — the real cause
(insufficient OPG allowance) was buried inside the exception message.

Fix:
- Add `import httpx` and intercept httpx.HTTPStatusError before the
  generic `except Exception` handler in both _chat_request() and
  completion().
- When status == 402, raise a RuntimeError with a clear, actionable hint
  telling the user to call llm.ensure_opg_approval(opg_amount=<amount>).
- Also handle 402 in _parse_sse_response() for the streaming path.
- All other HTTP errors continue to surface as before.

Closes OpenGradient#188
Previously convert_to_float32() always returned np.float32 regardless of
the decimals field, forcing callers to manually cast integer results
(issue OpenGradient#103 — add proper type conversions from Solidity contract to Python).

Fix:
- Add convert_fixed_point_to_python(value, decimals) that returns int when
  decimals == 0 and np.float32 otherwise.  np.array() automatically picks
  the correct dtype (int64 vs float32) based on the element types.
- Update both convert_to_model_output() and convert_array_to_model_output()
  to call the new function.
- Keep convert_to_float32() as a deprecated backward-compatible alias so
  any external code that imports it directly continues to work.

Partially closes OpenGradient#103
… in run_workflow

Bug 4 — None crash in infer():
  _get_inference_result_from_node() can legitimately return None when the
  inference node has no result yet.  Previously this None was passed
  directly to convert_to_model_output(), which calls event_data.get(),
  causing an AttributeError: 'NoneType' object has no attribute 'get'.

  Fix: after calling _get_inference_result_from_node(), check for None
  and raise a clear RuntimeError with a human-readable message and the
  inference_id so callers know what to retry.

  Also guard the precompile log fetch: if parsed_logs is empty, raise
  RuntimeError instead of crashing with an IndexError on parsed_logs[0].

Bug 5 — hardcoded 30M gas in run_workflow():
  run_workflow() built the transaction with gas=30000000 (30 million)
  unconditionally.  This is wasteful (users overpay for gas) and can
  fail on networks with a lower block gas limit.

  Fix: call estimate_gas() first (consistent with infer()), then multiply
  by 3 for headroom.  Fall back to 30M only if estimation itself fails.

Also fixes new_workflow() deploy to use INFERENCE_TX_TIMEOUT constant
instead of the previous hardcoded 60 seconds.
@amathxbt
Contributor Author

Hey @adambalogh 👋 — this PR addresses 5 critical bugs found during a code audit, 3 of which directly close or partially close open issues you and the team have filed:

| # | Bug | Issue |
| --- | --- | --- |
| 1 | get_llm_tee() always returned tees[0] — no randomization or load distribution | Closes #200 |
| 2 | HTTP 402 from TEE was swallowed as a cryptic RuntimeError with no guidance | Closes #188 |
| 3 | Fixed-point values with decimals=0 returned np.float32 instead of int | Partially closes #103 |
| 4 | infer() crashed with AttributeError when the inference node returned None | Code audit |
| 5 | run_workflow() hardcoded gas=30000000 instead of using estimate_gas() | Code audit |

All changes are backward-compatible. Happy to adjust anything before review! 🙏
